Digitizing a Million Books: Challenges for Document Analysis

Authors

  • K. Pramod Sankar
  • Vamshi Ambati
  • Lakshmi Pratha
  • C. V. Jawahar
Abstract

This paper describes the challenges facing the document image analysis community in building large digital libraries with diverse document categories. The challenges are identified from the experience of ongoing activities toward digitizing and archiving one million books. A smooth workflow has been established for archiving large quantities of books, with the help of efficient image processing algorithms. However, much more research is needed to address the challenges arising from the diversity of content in digital libraries.

Related resources

A Mass Digitization Primer

Many people are talking these days about “digitizing books.” But what does that really mean? This paper describes different kinds of digitizing, the pros and cons of each, and suggests a layered structure for understanding “digitization.” The Digital Age has brought many challenges for librarians. Most obvious are all the issues concerning “born-digital” material that academics and the general ...

A policy framework for the challenges of implementing regional higher education management in Iran

The models of regional governance in the world, particularly for administration of higher education are considered vital. In Iran, with the approval of Iran's Higher Education System Spatial Management Document, the issue of regional management in higher education was given special attention. Articles 1 and 2 of the document specifically address the regional higher education structure of the ...

An Analysis of Ministry of Education’s Strategic Plans Based on Favorable Components of English Language Teaching Using Shannon’s Entropy

The present research aims to analyze the content of Ministry of Education’s strategic plans (the Fundamental Reform Document of Education, the Comprehensive National Scientific Plan and the National Curriculum Document) based on Shannon's entropy regarding the favorable components of teaching English. The contents of the Fundamental Reform Document of Education, the Comprehensive National Scien...

The Personalized Services in CADAL Digital Library

CADAL is a large digital library project that digitizes one million books and publishes them to internet users. Users are clearly confronted with an information overload problem when visiting the CADAL portal. Therefore, we have been concerned with providing useful and flexible personalization services to reduce the users’ time and energy cost of finding interesting information...

Slicing Books – The Authors' Perspective

While authors are still struggling to understand how to make best use of the potential offered by hypertext documents, computer science research proceeds to develop the next generation of the digital documents. This next generation will be based on richer semantics, more potential for automation and personalization and will pose new challenges to the authors. This paper aims to give a first imp...

Publication date: 2006